DOCUMENT RESUME

ED 111 871                                              004 839

AUTHOR        Poole, Richard L.
TITLE         Student Victimization and the Formulation of Test
              Construction Criteria.
PUB DATE      [Apr 75]
NOTE          18p.; Paper presented at the Annual Meeting of the
              National Council on Measurement in Education
              (Washington, D.C., March 31-April 1975)

EDRS PRICE    MF-$0.76 HC-$1.58 Plus Postage
DESCRIPTORS   Elementary Secondary Education; *Evaluation Criteria;
              Grading; Higher Education; *Student Attitudes;
              *Student Testing; *Test Construction; *Testing
              Problems; Test Interpretation; Test Reliability; Test
              Selection; Test Validity

ABSTRACT
              This study surveyed 53 master's level students in two
tests and measurement classes to determine if and in what fashion
they had been victimized by testing and evaluation. The purpose of
this activity was to acquire personal data which could be used to
both dramatize and formulate criteria for the construction of
classroom tests. A majority of students felt they had been
victimized, and their experiences were at all educational
levels: elementary and high school; undergraduate and graduate
school; in the military and in higher education. These experiences
illustrated procedural infringements related to validity,
reliability, interpretation, and administration of tests. (Author)



Documents acquired by ERIC include many informal unpublished
materials not available from other sources. ERIC makes every effort
to obtain the best copy available. Nevertheless, items of marginal
reproducibility are often encountered and this affects the quality
of the microfiche and hardcopy reproductions ERIC makes available
via the ERIC Document Reproduction Service (EDRS). EDRS is not
responsible for the quality of the original document. Reproductions
supplied by EDRS are the best that can be made from the original.



STUDENT VICTIMIZATION AND THE FORMULATION OF
TEST CONSTRUCTION CRITERIA

ABSTRACT

This study surveyed 53 master's level students in two
tests and measurement classes to determine if and in what
fashion they had been victimized by testing and evaluation.
The purpose of this activity was to acquire personal data
which could be used to both dramatize and formulate criteria
for the construction of classroom tests. A majority of
students felt they had been victimized, and their experiences
were at all educational levels: elementary and high school;
undergraduate and graduate school; in the military and in
higher education. These experiences illustrated procedural
infringements related to validity, reliability, interpretation
and administration of tests.



U.S. DEPARTMENT OF HEALTH,
EDUCATION & WELFARE
NATIONAL INSTITUTE OF
EDUCATION
THIS DOCUMENT HAS BEEN REPRODUCED
EXACTLY AS RECEIVED FROM THE PERSON
OR ORGANIZATION ORIGINATING IT.
POINTS OF VIEW OR OPINIONS STATED
DO NOT NECESSARILY REPRESENT
OFFICIAL NATIONAL INSTITUTE OF
EDUCATION POSITION OR POLICY.






STUDENT VICTIMIZATION AND THE FORMULATION OF TEST CONSTRUCTION
CRITERIA¹

Richard L. Poole
The National College of Education

Although the criteria for the selection and development of
tests are well known and documented (APA, AERA, NCME, 1974),
instructional activities devoted to their instillment in students
are neither well known nor documented (Mayo, 1967). Indeed, most
tests and measurement textbooks (Ahmann and Glock 1971, Ebel 1965,
and Gronlund 1971) treat this topic, but such treatment appears
to be insufficient to induce students or inservice teachers to
employ these criteria in their work.

Hence, this study was designed to establish an instructional
activity which encourages teachers to employ test criteria in the
construction of their classroom examinations. More specifically,
the primary objective was to describe and illustrate an
instructional activity for measurement teachers which (a) humanizes
and personalizes for the student the identification and formulation
of test construction criteria, (b) dimensionalizes from the
students' experience the criteria for the construction of
classroom tests, and (c) provides the data to inductively formulate
the test construction criteria of validity, reliability,
interpretability, and administrability. The secondary objective of
the study was to show how additional student experiences and data
could be acquired for the introduction, discussion, and treatment
of many topics germane to tests and measurements.

1. Paper presented at the annual meeting of the National Council
on Measurement in Education, Washington, D.C., April 1975.

Method

Two graduate (master's level) tests and measurement classes
at a large Eastern state university were asked as an assignment
to respond to the following statement made by their instructor:

     The notion has been conveyed to me that students
     during their educational careers sometimes feel as
     though they have been victimized by testing and
     evaluation. For next time, would you please tell me in
     writing if and how you have been victimized by testing
     and evaluation.

The classes were composed of 27 men and 26 women who were, for
the most part, inservice teachers. The number of incidents cited
per student ranged from one to nine, and the total number of
statements produced was 129. These statements attested to poorly
constructed and administered classroom testing or evaluation
instruments, misused standardized instruments, and improper
grading policies or procedures. In addition, these statements
addressed not only the type of instrument employed, standardized
(STD) or teacher-made (TM), but also the item format employed,
supply (SP) or select (SE). Moreover, despite the fact that the
respondents were graduate students, their statements indicated
that infractions were sustained at all educational levels: the
elementary (ELM) and high (HGH) school (SCH), the undergraduate
(UNG) and graduate (GRD) school, as well as in higher education
(HED) and the military (MIL). Some of the replies did not indicate
where the infractions were sustained and were classified as
unspecified (UNS).

It was necessary to increase the areas of infraction, because
some of the situations specified in the student replies did not
reference the criteria associated with the selection or
construction of tests. Some replies were of a miscellaneous or poor
testing nature and were accordingly labeled "Poor testing" for
classification purposes. Other replies were of a social nature and
featured either grading, placement and award issues, or testing as
punishing, as putting a person down, or as otherwise dehumanizing.
In a manner similar to the first case, the labels assigned to these
two groups were, respectively, "Grading, placement, awards," and
"Social insensitivity."

Inasmuch as the replies indicated not only the kind of
infraction and where in the student's experience he had sustained
the situation, but also the style of the instrument and item
format associated with it, it was then possible to classify
replies employing all four groupings. Accordingly, instrument
style was nested within the kind of infraction, and the item type
was nested within where the situation was sustained by the
student. Methodologically, the student replies were analyzed,
tallied, and converted to percentages within a nested two-way
classification system.
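
The tallying step described above can be sketched in a few lines of
Python. This is only an illustrative sketch, not the author's
procedure: the coded replies listed are invented, the function names
nested_percentages and marginal are hypothetical, and reusing the UNS
code for an unspecified item format is an assumption. Each reply is
coded on the four groupings, the cells of the nested two-way table are
counted, and each count is expressed as a percentage of all replies.

```python
from collections import Counter

# Hypothetical coded replies: (infraction, instrument, level, item_format).
# Codes follow the paper's abbreviations; the rows themselves are invented,
# and using "UNS" for an unspecified item format is an assumption.
replies = [
    ("Validity", "TM", "UNG", "SE"),
    ("Validity", "TM", "HGH", "SP"),
    ("Administrability", "TM", "ELM", "SP"),
    ("Social insensitivity", "TM", "HGH", "UNS"),
    ("Reliability", "STD", "HGH", "SE"),
    ("Grading, placement, awards", "TM", "UNG", "UNS"),
    ("Validity", "STD", "ELM", "SE"),
    ("Administrability", "TM", "UNS", "UNS"),
]


def nested_percentages(rows):
    """Tally rows into (infraction, instrument) x (level, item format) cells
    and express each cell as a percentage of all replies."""
    counts = Counter(((inf, inst), (lvl, fmt)) for inf, inst, lvl, fmt in rows)
    total = len(rows)
    return {cell: 100.0 * n / total for cell, n in counts.items()}


def marginal(table, axis, depth):
    """Sum cell percentages over one margin of the nested table.

    axis 0 = row key (infraction, instrument); axis 1 = column key
    (level, item format). depth 0 = outer factor; depth 1 = nested factor.
    """
    totals = Counter()
    for cell, pct in table.items():
        totals[cell[axis][depth]] += pct
    return dict(totals)


table = nested_percentages(replies)
by_infraction = marginal(table, axis=0, depth=0)
by_instrument = marginal(table, axis=0, depth=1)
by_level = marginal(table, axis=1, depth=0)

# Print the infraction categories from most to least frequent.
for name, pct in sorted(by_infraction.items(), key=lambda kv: -kv[1]):
    print(f"{name:28s} {pct:5.1f}%")
```

With the 129 coded statements in place of the invented rows, the same
marginal sums would give category percentages of the kind reported in
the Results.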

Results and Discussion 

The results of the analysis and classification of the student
replies are included in Table 1. Examination of the data suggests
that at least two themes were present during the testing and
evaluation situations experienced by the 53 surveyed students. One
theme related to what might be called technical inadequacies of
the tests and evaluation instruments employed, and the other to
the human or inhuman characteristics manifested by the teacher
during or following the testing or evaluation situation.

In regard to the technical matters, the student replies
illustrated violations or infringements germane to each test
construction or selection criterion. The two test criteria receiving
the highest percentage of infractions were those of validity (23.2%)
and administrability (21.7%), with the criteria of reliability
(7.8%) and interpretability (2.3%) receiving the fewest.

In terms of test type, and for all criteria, the teacher-made
instrument, regardless of its educational level, was the source
of three times as much criticism as the standardized instrument.

Infractions were sustained at all levels of the students'
educational careers, from the elementary school through graduate
school, and even in higher education and the military. Twenty-four
percent of the replies did not indicate where the situation
occurred, but it should be noted that there was an increasing
percentage of replies for the elementary school (13%), senior
high school (22%), and the undergraduate school (28%). Recency
might account for a portion of this result, but for the students
surveyed it might also be concluded that long-term memory was
operating.

The second theme which appeared in the victimization accounts
written by these students highlighted the human element or, more
correctly as stated by them, the inhuman element. According to
the student accounts, nearly 21% of their teachers added insult to
injury both by making derogatory remarks and by belittling them
in front of their classmates. The percentage of these indignities
was greatest at the high school, and this rate was two and
a half times as great as that at the elementary or undergraduate
school. In light of the personal sensitivity of most teenagers
perhaps this result is not unanticipated, but for the students
concerned it apparently did not contribute to their "self concept"
or future achievement.



If one expands the test criterion of administrability to refer
to the teacher's demeanor during and following the testing or
evaluation situation, then the categories of administrability and
social insensitivity could be collapsed together. The result of this
for the group surveyed would be that the primary area of infraction
would no longer be the highly technical criterion of validity, but
the less technical and more practical criterion of test
administrability. And the conclusion would be that nearly half (43%)
of the noted testing and evaluation infractions could be eliminated
by communicating to teachers the need for being courteous and
thoughtful during the administration of classroom tests and when
they are being reviewed.
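
The arithmetic behind this collapsing can be checked with a short
Python sketch. It uses only percentages quoted in the text; treating
"nearly 21%" as 21.0 is an assumption, the poor testing category is
left out because its percentage is not stated in the text, and the
helper name collapse is hypothetical.

```python
# Percentages of the student replies as quoted in the text.
reported = {
    "Validity": 23.2,
    "Administrability": 21.7,
    "Reliability": 7.8,
    "Interpretability": 2.3,
    "Grading, placement, awards": 17.0,
    "Social insensitivity": 21.0,  # assumption: text says "nearly 21%"
}


def collapse(table, sources, target):
    """Return a copy of table with the source categories summed into target."""
    merged = {k: v for k, v in table.items() if k not in sources}
    merged[target] = sum(table[k] for k in sources)
    return merged


collapsed = collapse(
    reported,
    ["Administrability", "Social insensitivity"],
    "Administrability (expanded)",
)

# Compare the leading category before and after the collapse.
top_before = max(reported, key=reported.get)
top_after = max(collapsed, key=collapsed.get)
print(top_before, "->", top_after, f"({collapsed[top_after]:.1f}%)")
```

Under these figures validity leads before the collapse, and the
expanded administrability category, at roughly 43%, leads after it.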

Another result was that six of the 26 women surveyed, or five
percent of the group, reported that as far as they could recall, they
had not been victimized by testing or evaluation.
Further inspection of Table 1 also shows that 17% of the
student replies referenced the issue of grading, placement, or
awards, and that the incidence of infraction when comparing
teacher-made to standardized instruments is nearly three to one. As
recalled by the students, the specific nature of some of these
infractions is shown in Table 2. Additionally, Table 2 contains other
unedited student replies for each of the categories. From the
student's perspective, the abusiveness of these replies is evident,
though it is unknown from the teacher's, yet the point is to be made
that greater understanding and improved use of measurement techniques
in education must include the human element.




Writing in the final report of a project entitled "Pre-Service
Preparation of Teachers in Educational Measurement," Mayo (1967)
indicated that teachers need to acquire competency in the
construction and evaluation of classroom tests. He added that
"measurement teachers should contrive more ingenious ways to
demonstrate the ultimate usefulness of certain competences as they
are being learned," and that test and measurement courses could be
improved by developing more meaningful presentations of material.

The activity described here permits the student to glean a
perception of test construction and selection criteria which
apparently has not been conveyed by the usual textbook-lecture
approach. The perception is drawn from the student's own experience
and synergistically enlarged by the experiences of his peers. The
extent of the personal meaning achieved from this activity appears in
the self-criticism which students make following the construction and
analysis of their own teacher-made achievement tests (for a class
they are teaching).

It is unclear to the investigator whether this activity could
be called ingenious, but its meaningfulness to the student is
certainly clear. As one student wrote at the end of his reply:

     "A teacher should recall any personal detriment
     suffered due to testing inconsistencies
     and analyze them for guidelines to be used
     in the construction of more meaningful tests."



TABLE 1

[Classification of student replies, in percentages, by kind of
infraction and instrument type (STD, TM) and by educational level
and item format (SP, SE). The table body is not recoverable from
this reproduction.]



TABLE 2

Illustrative Student Replies, by Category

Category     Reply

Validity

1. During a psychology course recently I was "victimized" as were
the other students by the evaluation method used. The professor
lectured to the class on the material which probably was
most interesting to him or which he could add his own comical-
sarcastic adlibs. His classes were enjoyable because he did
entertain well. When it came to testing, however, he went
completely to the book and chose the smallest, most factual
material that he could find for at least 60% of the test. There
was little correspondence between class material and testing
material.

2. My first response was, "I've never been victimized by a test.
I always did well, and therefore have nothing to complain
about." But the more I thought about the question, the more
I realized that I wasn't victimized by the tests themselves
because I knew how to play the game. That is, I knew how to
memorize and was an absolute ace at recall. Now, as I have
thought through the question and its implications, I feel a
cold anger not only in regard to classroom testing, but to the
philosophy and the objectives behind the tests. Evaluation
consisted of testing ability at recall and memorization;
it did not test ability to think, to solve problems, or to
apply knowledge in a creative way.

3. Styles come and go in education and so the "progressive"
principal of my public school in city X in the 1920's
decided to give intelligence tests to all the elementary pupils.
Because I did well with good visual memory and the inner
motivation to do well, they decided to skip me several half grades
in the lower grades. Both of my parents had been teachers and
saw nothing wrong in this procedure since I was capable of
assimilating material and was serious about school marks, etc.
However, I was the slow maturing type and this advancement in
school threw me sadly out of pace with my fellow students.
My social development got stuck, I became a "loner" with the
exception of one friend with whom I did many things outside
of school before graduation in eighth grade. I had trouble
joining in with my peers all during high school and even an
extra post graduate year in high school to get me to the age
of 17 before entrance into college did not mend the gap
socially.

Administrability

4. I don't remember the exact times, but the following situation
has happened to me more than once. This is the time when the
teacher is "kind" enough to tell you what to study, throws
you a curve, and you do not have any idea what the test or
final is about.

5. On the high school level, most tests seemed pretty fair
except that their directions were often vague. Most students
just assumed a set way of answering questions and then
answered them without paying too much attention to the directions.

6. As a student in grade school, I was "victimized" by classroom
testing as follows: I was a consistently good student, but
one day the class got back some arithmetic tests and my
grade was F. It seemed unlikely to my parents so we checked
the problems and many marked wrong were seemingly correct.
Reviewing it with the teacher, it was discovered that I had
copied numbers incorrectly from the board. (As it turned out
we discovered I was quite nearsighted.) The testing was
unfair since problems written on the blackboard in a large room
may not be equally clear to all students.

Social Insensitivity

7. First of all, remembering back, my first experience of this
was being threatened by a teacher that we would have to take
a series of tests unless we started to behave ourselves.

8. ... a list of English words were given and we were asked to
make sentences with them. I remember mistaking the word
"shrivel" for "shiver" and made the sentence: "Don't shrivel
with coal (for cold) wear your sweater." The teacher copied
the sentence with my name on it and pasted it on the bulletin
board. The other students made fun of me and were very cruel,
apparently without meaning to be. But after that incident, I
never wanted to write anything for this particular teacher.

9. ... When I inquired into why I received a low grade, the
response was "Anyone who receives an "E" (failing grade) on
any exam, never deserves a grade of "B" in my course!"

Reliability

10. When I was in the 9th grade my history teacher asked a true-
false question I felt was ambiguous. I was a good student
and the incorrect answer did not affect my grade, however, I
was annoyed because it didn't tap my knowledge of the subject
but rather my interpretation of what the word important meant
as compared with the teacher's interpretation of the word.
In this case, neither my understanding or factual knowledge
of the subject were tapped by the question. This happened
a dozen years ago, yet, I can still remember the question and
how and why I answered it "incorrectly".

11. ... Other times of being victimized by a test occurred
when some factor (external or internal) has distracted me to
such an extent that I was not able to perform as expected.
Such distractions have taken the form of noise, being slightly
ill, etc. This distraction has also been of the form of the
make-up of the test. This happened when the test activity
was unfamiliar, that is not of the form that was used in the
instruction.

Interpretability

12. ... The test wasted the person's time in taking it, was
unfair to the student who worked hard to get a good grade, and
gave no information to me as to whether I really learned the
material required in the course.

13. ... no feedback.

Grading

14. I received a "99" on my 8th grade English final exam. My
teacher said that I answered all the questions correctly but
she had to deduct one point from my composition because she
didn't feel that anyone deserved a perfect grade in English!

15. My example deals more with being victimized as a result of
evaluational procedures rather than actual testing, although
testing did play a role. During my work as an undergraduate,
I was required to take a course in Statistical Psychology in
order to take a course that I wanted to take in abnormal
psychology. The course was designed for psy majors, and was
in some sense used as a basis for weeding out the unwanted
excess. Being a math major, I felt that I would have little trouble.
The grade was determined by evaluating performances on three
hourly examinations, one final and a lab project. On two of
the hourly exams I scored A, on the third C. Seeing that we all
strive for that "A", I sought advice from the instructor. I
was told that all I needed to do was to score high on the
final and get an A on the project (the project being more
important). My project was graded A, and my final A, but
somehow I was given a B. When I returned to question the grade, I
found that the instructor had left for the summer. An assistant
offered an explanation based on "cut-off" points. You see,
they could only award a certain number of A's.

16. ... One practice that is very bad is that of reserving the
high grades for majors.

17. ... I was then flabbergasted when she (the teacher) nearly
failed me at the final test in which I had put up my best
performance. When I went to find out why I was graded so low she
said I had been cheerful in class and she did not like cheerful
people. This instructor in her dealings with me at this
instance let her emotions dictate for her how to evaluate my
work.

Poor Testing

18. I'm not exactly sure exactly "how" I was victimized through
testing and evaluation procedures in my past.
That I have been victimized, there is no doubt, for in reading
the yellow booklet* on test construction I find that my own
tests contain many faults when analyzed according to the
checklist in the rear of the book. I was never taught how to
construct tests, therefore I suppose I constructed them partially
by "common sense" and by the influence of tests I took in the past
(conditioning, if you like).

* Improving the Classroom Test, A Manual of Test Construction
Procedures for the Classroom Teacher. The State Education
Department, Bureau of Examinations and Testing, Albany, 1958.




References

Ahmann, J.S. and Glock, M.D. Evaluating Pupil Growth.
     (4th Ed.) Boston: Allyn and Bacon, 1971.

American Psychological Association, American Educational
     Research Association, National Council on Measurement
     in Education. Standards for Educational and Psychological
     Tests. Washington: American Psychological Association, 1974.

Ebel, R.L. Measuring Educational Achievement. Englewood
     Cliffs: Prentice Hall, 1965.

Gronlund, N.E. Measurement and Evaluation in Teaching.
     (2nd Ed.) New York: Macmillan, 1971.

Mayo, S.T. Pre-Service Preparation of Teachers in
     Educational Measurement. Chicago: Loyola University,
     December 1967. Project 5-0807, Contract No. OE 4-10-011,
     U.S. Department of Health, Education, and Welfare.






